Abstract

In a previous report we evaluated analytically the mutual information between the firing
rates of N independent units and a set of multi-dimensional continuous+discrete stimuli, for a
finite population size and in the limit of large noise. Here, we extend the analysis to the case of
two interconnected populations, where input units activate output ones via gaussian weights and
a threshold-linear transfer function. We evaluate the information carried by a population of M
output units, again about continuous+discrete correlates. The mutual information is evaluated
by solving saddle point equations under the assumption of replica symmetry, a method which, by
taking into account only the term linear in N of the input information, is equivalent to assuming
the noise to be large. Within this limitation, we analyze the dependence of the information on
the ratio M/N, on the selectivity of the input units and on the level of the output noise. We
show analytically, and confirm numerically, that in the limit of a linear transfer function and of a
small ratio between output and input noise, the output information approaches asymptotically the
information carried in input. Finally, we show that the information loss in output does not depend
much on the structure of the stimulus, whether purely continuous, purely discrete or mixed, but
only on the position of the threshold nonlinearity, and on the ratio between input and output noise.



I. INTRODUCTION 

Recent analyses of extracellular recordings performed in two motor areas of behaving
monkeys have tried to clarify how information about movements is transmitted and received
from higher to lower stages of processing, and to identify distinct roles of the two areas in
the planning and execution of movements [1]. Although this study failed to produce clearcut
results, it remains interesting to try and understand, from a more theoretical point of view,
how information about multi-dimensional correlates of neural activity may be transmitted
from the input to the output of a simple network. In fact, a theoretical study that explores
how the coding of stimuli with continuous as well as discrete dimensions is transferred
across a network is still lacking.

In a previous report the mutual information between the activity ('firing rates') of a 
finite population of N units ('neurons') and a set of correlates, which have both a discrete 
and a continuous angular dimension, has been evaluated analytically in the limit of large 
noise. This parametrization of the correlates can be applied to movements performed in a 
given direction and classified according to different "types"; yet it is equally applicable to 
other correlates, like visual stimuli characterized by an orientation and a discrete feature 
(colour, shape, etc.), or in general to any correlate which can be identified by an angle and 
a "type". In this study, we extend the analysis performed for one population, to consider 
two interconnected areas, and we evaluate the mutual information between the firing rates of 
a finite population of M output neurons and a set of continuous+discrete stimuli, given that 
the rate distribution in input is known. In input, a threshold nonlinearity has been shown
to lower the information about the stimuli in a simple manner, which can be expressed as
a renormalization of the noise [2]. How does the information in the output depend on the
same nonlinearity? How does it depend on the noise in the output units? Is the power to 
discriminate among discrete stimuli more robust to transmission down one set of random 
synapses, than the information about a continuously varying parameter? 

We address these issues by calculating the mutual information, using the replica trick and
under the assumption of replica symmetry (see for example [3]).

Saddle point equations are solved numerically. We analyze how the information transmission
depends on the parameters of the model, i.e. on the level of output and input noise, on
the ratio between the two population sizes, on the tuning curve with respect to
the continuous correlate, and on the number of discrete correlates.

The input-output transfer function is a crucial element in the model. The binary and
the sigmoidal functions used in many earlier theoretical and simulation studies [4] fail to
describe accurately current-to-frequency transduction in real neurons. Such transduction is
well captured instead, away from saturation, by a threshold-linear function [5], [6]. Such a
function combines the threshold of real neurons, the linear behaviour typical of pyramidal
neurons above threshold, and the accessibility to a full analytical treatment [7], [8], as
demonstrated here, too. For the sake of analytical feasibility, however, we take the input units to
be purely linear. Therefore it should be kept in mind, in considering the final results, that
the threshold nonlinearity is only applied to the output units.



II. THE MODEL 

In analogy to the model studied in [2], we consider a set of N input units which fire to an
external continuous+discrete stimulus, parametrized by an angle $\vartheta$ and a discrete variable
s, with a gaussian distribution:

$$p(\{\eta_j\}|\vartheta,s) = \prod_{j=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(\eta_j - \tilde\eta_j(\vartheta,s))^2}{2\sigma^2}\right]; \qquad (1)$$

$\eta_j$ is the firing rate in one trial of the j-th input neuron, while the mean of the distribution,
$\tilde\eta_j(\vartheta,s)$, is written:

$$\tilde\eta_j(\vartheta,s) = \varepsilon_s^j\, \bar\eta_j(\vartheta) + (1-\varepsilon_s^j)\, \eta^f; \qquad (2)$$

$$\bar\eta_j(\vartheta-\vartheta_j^0) = \eta^0 \cos^{2m}\left(\frac{\vartheta-\vartheta_j^0}{2}\right); \qquad (3)$$

where $\varepsilon_s^j$ is a quenched random variable distributed between 0 and 1, and $\vartheta_j^0$ is the preferred
direction for neuron j. According to eq.(2) neurons fire at an average firing rate which
modulates with $\vartheta$ with amplitude $\varepsilon_s$, or takes a fixed value $\eta^f$, independently of $\vartheta$, with
amplitude $1-\varepsilon_s$.

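We assume that quenched variables are uncorrelated and identically distributed across
units and across the K discrete correlates:

$$\varrho(\{\varepsilon_s^j\}) = \prod_{j,s} \varrho(\varepsilon_s^j) = [\varrho(\varepsilon)]^{NK}; \qquad \varrho(\{\vartheta_j^0\}) = [\varrho(\vartheta^0)]^{N} = \frac{1}{(2\pi)^N}. \qquad (4)$$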
In [2] it has been shown that a cosinusoidal function as in eq.(3) is able to capture
the main features of directional tuning of real neurons in motor cortex. Moreover it has
been shown that the presence of negative firing rates in the distribution (1), which is not
biologically plausible, does not alter information values with respect to a more realistic choice
for the firing distribution, in that it leads to the same curves except for a renormalization of
the noise.

Output neurons are activated by input neurons via uncorrelated gaussian connection
weights $J_{ij}$.

Each output neuron performs a linear summation of the inputs; the outcome is distorted
by a gaussian distributed noise $\delta_i$ and then thresholded, as in the following:

$$\xi_i = \left[\xi^0 + \sum_j c_{ij} J_{ij} \eta_j + \delta_i\right]^+; \qquad i = 1..M,\; j = 1..N \qquad (5)$$

In eq.(5) $\xi^0$ is a threshold term, $c_{ij}$ is a (0,1) binary variable, with mean c, which expresses
the sparsity or dilution of the connectivity matrix, and

$$\langle (J_{ij})^2 \rangle = \sigma_J^2; \qquad \langle J_{ij} \rangle = 0; \qquad (6)$$

$$\langle (\delta_i)^2 \rangle = \sigma_\delta^2; \qquad \langle \delta_i \rangle = 0; \qquad (7)$$

$$p(c_{ij} = 1) = c; \qquad p(c_{ij} = 0) = 1-c; \qquad (8)$$

$$[x]^+ = x\,\Theta(x). \qquad (9)$$
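The generative model of eqs.(1)-(9) can be illustrated with a short simulation. The following is a minimal sketch, not the procedure used for the analytical results: it assumes the tuning curve has the form $\eta^0\cos^{2m}((\vartheta-\vartheta_j^0)/2)$, draws $\varepsilon$ from the three values 0, 1/2 and 1 with equal probability, and fixes $C\sigma_J^2 = 1$. The function names (`make_network`, `simulate_trial`) and the values chosen for `C` and `eta_f` are hypothetical choices made only for illustration.

```python
import numpy as np

def make_network(N, M, K, C, rng):
    """Draw the quenched variables; all numerical choices are illustrative."""
    return {
        "theta0": rng.uniform(0.0, 2.0 * np.pi, size=N),     # preferred directions
        "eps": rng.choice([0.0, 0.5, 1.0], size=(N, K)),      # modulation amplitudes
        "c": (rng.random((M, N)) < C / N).astype(float),      # dilution, p(c_ij = 1) = C/N
        "J": rng.normal(0.0, np.sqrt(1.0 / C), size=(M, N)),  # chosen so that C * sigma_J^2 = 1
    }

def simulate_trial(theta, s, net, rng, eta0=np.sqrt(0.1), eta_f=0.2, m=1,
                   sigma=1.0, sigma_delta=1.0, xi0=-0.4):
    """One trial: gaussian input rates, then thresholded noisy linear outputs."""
    # Mean input rates: tuned part weighted by eps, untuned part by (1 - eps).
    tuning = eta0 * (np.cos((theta - net["theta0"]) / 2.0) ** 2) ** m
    mean_eta = net["eps"][:, s] * tuning + (1.0 - net["eps"][:, s]) * eta_f
    eta = mean_eta + sigma * rng.standard_normal(mean_eta.shape)
    # Output units: diluted gaussian weights, additive output noise, threshold at zero.
    delta = sigma_delta * rng.standard_normal(net["J"].shape[0])
    h = xi0 + (net["c"] * net["J"]) @ eta + delta
    return eta, np.maximum(h, 0.0)

rng = np.random.default_rng(0)
net = make_network(N=100, M=10, K=4, C=10, rng=rng)
eta, xi = simulate_trial(theta=1.0, s=2, net=net, rng=rng)
```

Note that the input rates `eta` can be negative, as in the gaussian model, whereas the output rates `xi` are non-negative by construction.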

III. ANALYTICAL ESTIMATION OF THE MUTUAL INFORMATION 

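We aim at estimating the mutual information between the output patterns of activity
and the continuous+discrete stimuli:

$$I(\{\xi_i\}, \vartheta \otimes s) = \left\langle \sum_s \int d\vartheta \int \prod_i d\xi_i\; p(\vartheta,s)\, p(\{\xi_i\}|\vartheta,s) \log_2 \frac{p(\{\xi_i\}|\vartheta,s)}{p(\{\xi_i\})} \right\rangle_{\varepsilon,\vartheta^0,c,J,\delta}; \qquad (10)$$

$$p(\{\xi_i\}|\vartheta,s) = \int \prod_j d\eta_j\; p(\{\xi_i\}|\{\eta_j\})\, p(\{\eta_j\}|\vartheta,s); \qquad (11)$$

where the distribution $p(\{\xi_i\}|\{\eta_j\})$ is determined by the threshold linear relationship (5),
$p(\{\eta_j\}|\vartheta,s)$ is given in eq.(1) and $\langle .. \rangle_{\varepsilon,\vartheta^0,c,J,\delta}$ is a short notation for the average across the
quenched variables $\{\varepsilon_s^j\}$, $\{\vartheta_j^0\}$, $\{J_{ij}\}$, $\{c_{ij}\}$ and on the noise $\{\delta_i\}$. We assume that the stimuli
are equally likely: $p(\vartheta,s) = 1/2\pi K$.
Eq.(10) can be written as:

$$I(\{\xi_i\}, \vartheta \otimes s) = H(\{\xi_i\}) - \langle H(\{\xi_i\}|\vartheta,s) \rangle_{\vartheta,s}; \qquad (12)$$

with:

$$\langle H(\{\xi_i\}|\vartheta,s) \rangle_{\vartheta,s} = -\left\langle \sum_s \int d\vartheta \int \prod_i d\xi_i\; p(\vartheta,s)\, p(\{\xi_i\}|\vartheta,s) \log_2 p(\{\xi_i\}|\vartheta,s) \right\rangle_{\varepsilon,\vartheta^0,c,J,\delta}; \qquad (13)$$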
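The decomposition of the mutual information into a response entropy minus an equivocation can be made concrete on a toy problem with a handful of discrete stimuli and responses. The sketch below is only an illustration of that identity; the function name `mutual_information_bits` and the example conditional distributions are hypothetical and unrelated to the network model.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy, in bits, of a discrete distribution (zeros are skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information_bits(p_r_given_s, p_s):
    """I(R;S) = H(R) - <H(R|S)>_S for discrete responses R and stimuli S."""
    p_r_given_s = np.asarray(p_r_given_s, dtype=float)  # rows: stimuli, columns: responses
    p_s = np.asarray(p_s, dtype=float)
    p_r = p_s @ p_r_given_s                              # marginal response distribution
    equivocation = sum(ps * entropy_bits(row) for ps, row in zip(p_s, p_r_given_s))
    return entropy_bits(p_r) - equivocation

# Four equally likely stimuli, as for K = 4 discrete correlates.
p_s = np.full(4, 0.25)
perfect = np.eye(4)              # each stimulus evokes its own response
useless = np.full((4, 4), 0.25)  # responses independent of the stimulus
i_perfect = mutual_information_bits(perfect, p_s)
i_useless = mutual_information_bits(useless, p_s)
```

In the first case the information equals the entropy of the stimulus set, $\log_2 4 = 2$ bits, which is the ceiling referred to in the text; in the second it vanishes.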
The analytical evaluation of the equivocation $\langle H(\{\xi_i\}|\vartheta,s) \rangle_{\vartheta,s}$ can be performed inserting
eq.(11) in the expression (13), and using the replica trick to get rid of the logarithm:

<tf({&}|<M)>*,„ = 



J,a J,a 
To take into account the threshold-linear relation (5) we consider the following equalities:



d£il[p(£i\{n?}) = IlPte = 0\{rff}) + / d^IIPtelW}) = 
Inserting eq.(16) in eq.(15) one obtains:
(17)



The average across the quenched disorder c, J, $\delta$ in eq.(17) can be performed in a very
similar way as shown in ||: using the integral representation for each $\delta$ function, gaussian
integration across J, $\delta$ is standard; the average on c can be performed assuming the
number N of input neurons to be large. The final outcome for the equivocation reads:

where we have put $c \to C/N$.

Integration on $\{x^\alpha\}$ is straightforward. Integration on $\{\eta_j^\alpha\}$ can be performed introducing
$(n+1)^2$ auxiliary variables $z_{\alpha\beta} = \frac{1}{N}\sum_j \eta_j^\alpha \eta_j^\beta$ via $\delta$ functions expressed in their integral
representation. Considering the expression (1) for the input distribution and with some
rearrangement of the terms the final result can be expressed as:
where:



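$$\Sigma_{\alpha\beta} = \delta_{\alpha\beta} + 2\sigma^2\, i\tilde z_{\alpha\beta};$$

$$G_{\alpha\beta} = \sigma_\delta^2\, \delta_{\alpha\beta} + C\sigma_J^2\, z_{\alpha\beta}. \qquad (20)$$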

The evaluation of the entropy of the responses $H(\{\xi_i\})$, eq.(14), can be carried out in a
very similar way, introducing replicas in the continuous+discrete stimulus space. The final
result reads:



H({£i}) = Urn—!— fr\^L f \\d~z a(3 e tN ^^ Za ^e-^ Tr ^ 
IV. REPLICA SYMMETRIC SOLUTION



The integrals in eqs.(19), (21) cannot be solved without resorting to an approximation. In
analogy to what is used in [7], [8], we use a saddle-point approximation (which in general would
be valid in the limit $M, N \to \infty$) and we assume replica symmetry [3] in the parameters
$\{z_{\alpha\beta}\}$, $\{\tilde z_{\alpha\beta}\}$. This allows us to explicitly invert and diagonalize the matrices $G$, $\Sigma$:

$$z_{\alpha\alpha} = z_0(n); \qquad z_{\alpha\neq\beta} = z_1(n);$$
$$i\tilde z_{\alpha\alpha} = \tilde z_0(n); \qquad i\tilde z_{\alpha\neq\beta} = -\tilde z_1(n); \qquad (22)$$

The assumption of replica symmetry seems to have more subtle implications in the present 
situation. These will be discussed below. 



In replica symmetry the mutual information can be expressed as follows: 



n^o run 2 



]_ ( e N[(n+l)z^z^~n(n+l)z^z^-^(Tr\nG(z^,z^)+F(z^,z^))-^TrlnJ:(z^,z^)-H A (z^,z^)} 

in 2 I 
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(24) 
(25) 



F B (S ,5i) = -^ln 



X! ld^.M n+1 [p(^s)} n+1 L-E a A 5 ^~J)^^(^,s )/2a 



N 



(26) 



We have set $r = M/N$, and $z_0^{A,B}, \tilde z_0^{A,B}, z_1^{A,B}, \tilde z_1^{A,B}$ are the solutions of the saddle point equations:
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-Trln^z^zi) + H A ' H (z^~zi 



1 d 



l -Tr In E(So^i) + # A ' B (z , C 



ncfei 
— -[Tr In G(zb,«i) +F(ab,«i)]; 

~^a7 2~ [Tr ln G( * ' 2l) + F ^ ' Zl)] ' 



(27) 



All the equations must be evaluated in the limit $n \to 0$. It is easy to check that all terms
in the exponent in eq.(23) are order n. In fact, since when $n \to 0$ only one replica remains,
one has:



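$$\lim_{n\to 0}\left[\mathrm{Tr}\ln G(z_0,z_1) + F(z_0,z_1)\right] = 0 \;\Longrightarrow\; \mathrm{Tr}\ln G(z_0,z_1) + F(z_0,z_1) \simeq n\,\frac{\partial}{\partial n}\left[\mathrm{Tr}\ln G(z_0,z_1) + F(z_0,z_1)\right]_{n=0}. \qquad (28)$$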



~A,B 



Therefore, from the saddle point equations, z ' are order n and Tr In E is also order n: 
$$\mathrm{Tr}\ln\Sigma \simeq n\,\frac{\partial}{\partial n}\,\mathrm{Tr}\ln\Sigma\,\Big|_{n=0}. \qquad (29)$$
It is easy to check by explicit evaluation that, when $n \to 0$, all the
$n+1$ diagonal terms among the matrix elements $\{\delta_{\alpha\beta} - \Sigma^{-1}_{\alpha\beta}\}$ are order n and all the $n(n+1)$
out-of-diagonal terms are order 1. Then all terms in the exponent of eqs.(25),(26) are order
n, and we can expand the exponentials, which allows us to perform the quenched averages
across $\{\varepsilon, \vartheta^0\}$. Considering the expression of $\tilde\eta(\vartheta,s)$, eq.(2), one obtains:
(31)
(33)



A similar expansion in n for $\mathrm{Tr}\ln\Sigma(\tilde z_0, \tilde z_1)$ and for $\mathrm{Tr}\ln G(z_0,z_1) + F(z_0,z_1)$ allows us to
derive explicitly the saddle point equations:
(34)



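where:

$$\mathrm{erf}(x) = \int_{-\infty}^{x} Dx' = \int_{-\infty}^{x} dx'\, \sigma(x'); \qquad \sigma(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2};$$

$$p = \sigma_\delta^2 + C\sigma_J^2 (z_0 - z_1); \qquad q = C\sigma_J^2 z_1.$$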



From the expression of z ' in eq. (|34J) , it is easy to verify that the dependence on z, 



xA,B 

o 



in eq . (|30|) , which might affect the information in eq. (|23|) , cancels out with the products 
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ZqZ ,Zq z which should contribute to the information in the limit n — > (see eq.(p3|)). 
Therefore, since z ' is known and z 1 ' depends only on^' , the mutual information can 
be expressed as a function of Z\ ' 1 z 1 ' , which in turn are to be determined self-consistently 
by the saddle point equations. 

The average information per input cell can be written, finally: 



^m^®s) = ^{^z?-~z?z? + r 



IM** z^-V^zt z?)} + rf (if ) - vi{zt)} 



(35) 



with 



ri(zb,zi) = -o r^ \ — -r + -lnperf —== 

Z°-ty/q\ 



Dt erf f-^zM ln 

\ Vp J 



erf 



v 7 ^ 



(36) 



1 



2=A\ s^„2 



r^(if ) = +- ln(l + 2^5f ) - z?{a 



AJ) 



Z-B 



r^(if ) = - ln(l + 2a 2 5f ) - zf(a 



K) 



1 + 2a 2 z? 



A 1 -A 2 



(37) 
(38) 



The expression for the mutual information only contains terms linear in either N or M.
Since the last of the saddle-point equations, (34), contains r, if one fixes N and increases M
the information grows non-linearly, because the position of the saddle point varies. It turns
out that, as shown below, the growth is only very weakly sublinear, at least when M < N.
Analogously, fixing M and varying N we would find a non-linearity due to the r-dependence
of the saddle point. If r is fixed and N and M grow together, the information rises purely
linearly.

What our analytical treatment misses out, however, is the nonlinearity required to appear 
as the mutual information approaches its ceiling, the entropy of the stimulus set. The 
approach to this saturating value was described at the input stage || |Hj], where also the 
initial linear rise (in N) was obtained in the large noise limit [2|, |§. Therefore, our saddle 
point method is in some sense similar to taking a large (input) noise limit, $\sigma \to \infty$, to its
leading (order $N/\sigma^2$) term. It is possible that the saddle point method could be extended,
to account also for successive terms in a large noise expansion. This would probably require 
integrating out the fluctuations around the saddle point, but by carefully analysing the 
relation of different replicas to different values of the quenched variables. We leave this 
possible extension to future work. The present calculation, therefore, although employing a 
saddle point method which is usually applicable for large N and M, should be considered 
effectively as yielding the initial linear rise in the mutual information, the one observed with 
M small. 

V. NUMERICAL RESULTS 

Eq.(34) for $z_1^{A,B}$ has been solved numerically using a Matlab code. Convergence to
self-consistency has been found already after 50 iterations with an error lower than $10^{-10}$.
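The iterative solution of a self-consistent equation to a given tolerance can be sketched as follows. This is a generic fixed-point iteration in Python, not a port of the Matlab code mentioned above: the saddle point equations themselves are not reproduced, and the map `math.cos` is only a stand-in toy equation. The names `fixed_point`, `damping` and `max_iter` are hypothetical.

```python
import math

def fixed_point(update, z_init, tol=1e-10, max_iter=1000, damping=1.0):
    """Iterate z <- (1 - damping) * z + damping * update(z) until |dz| < tol.

    Returns the converged value and the number of iterations used.
    """
    z = z_init
    for iteration in range(1, max_iter + 1):
        z_new = (1.0 - damping) * z + damping * update(z)
        if abs(z_new - z) < tol:
            return z_new, iteration
        z = z_new
    raise RuntimeError("fixed-point iteration did not converge")

# Toy self-consistent equation z = cos(z), used here only as a stand-in.
z_star, n_iter = fixed_point(math.cos, z_init=1.0)
residual = abs(z_star - math.cos(z_star))
```

A damping factor smaller than 1 can help when the plain iteration oscillates; whether that was needed for the equations of this paper is not stated in the text.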

Fig.1 shows the mutual information as a function of the output population size, for an
input population size equal to 100 cells. This is contrasted with the information in the
input units, about exactly the same set of correlates, calculated as in [2], by keeping only
the leading (linear) term in N. In fact, in [2] the mutual information carried by a finite
population of neurons firing according to eq.(1) had been evaluated analytically, in the limit
of large noise, by means of an expansion in $N(\eta^0)^2/4\sigma^2$. To linear order in N the analytical
expression for the information carried by N input neurons reads:



$$I_{input}(\{\eta_j\}, \vartheta \otimes s) = \frac{1}{\ln 2}\,\frac{N}{2\sigma^2}\,(A_1 - A_2); \qquad (39)$$

where $A_1$, $A_2$ are defined, again, as in eqs.(31),(32). In analogy to what had been done
in || we have set $C\sigma_J^2 = 1$. As evident from the graph, also the output information is
essentially linear up to a value of $r \simeq 0.5$, and quasi-linear even for r = 1. It should be
recalled, again, that our saddle point method only takes into account the term linear in N
in the information input units carry about the stimulus. It is not possible, therefore, for
eq.(35) to reproduce the saturation in the mutual information as it approaches the entropy
of the stimulus set (which is finite, if one considers only discrete stimuli). The nearly linear
behaviour in M thus reflects the linear behaviour in N induced, in the intermediate quantity
(the information available at the input stage), by our approximate saddle point evaluation.
As is clear from the comparison in Fig.1, when the two populations of units are affected
by the same noise the input information is considerably higher than the output one. This is
expected, since output and input noise sum up while influencing the firing of output neurons,
but also because the input distribution is taken to be a pure gaussian, while the output rates
are affected by a threshold. If the input-output transformation were linear and the output
noise much smaller than the input one, one would expect that output and input units would
carry the same amount of information.
FIG. 1: Information rise, from eq.(35), as a function of the number M of output neurons. N = 100;
K = 4; $(\eta^0)^2 = 0.1$; $\alpha = 0.2$; $\xi^0 = -0.4$; $\sigma^2 = 1$; m = 1; $C\sigma_J^2 = 1$; $\sigma_\delta^2 = 1$. The distribution $\varrho(\varepsilon)$
in eq.(4) is just equal to 1/3 for each of the 3 allowed $\varepsilon$ values of 0, 1/2 and 1. The upper curve
is the linear term in the input information, calculated as a function of N as in [2] with identical
parameters.

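Briefly, in a linear network with zero output noise one has:

$$p(\{\xi_i\}|\{\eta_j\}) = \prod_i \delta\Big(\xi_i - \sum_j c_{ij} J_{ij} \eta_j\Big); \qquad (40)$$

Considering eqs.(11), (1), an effective expression for the distribution $p(\{\xi_i\}|\vartheta,s)$ can be
obtained by direct integration of the $\delta$ functions $\delta(\xi_i - \sum_j c_{ij} J_{ij} \eta_j)$, via their integral
representation, on $\{\eta_j\}$:

$$p(\{\xi_i\}|\vartheta,s) = \frac{1}{\sqrt{(2\pi)^M \det\Sigma}}\, \exp\Big[-\sum_{ij} \big(\xi_i - \tilde\xi_i(\vartheta,s)\big) \big(\Sigma^{-1}\big)_{ij} \big(\xi_j - \tilde\xi_j(\vartheta,s)\big)/2\Big]; \qquad (41)$$

$$\tilde\xi_i(\vartheta,s) = \sum_j c_{ij} J_{ij}\, \tilde\eta_j(\vartheta,s); \qquad (42)$$
(43)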



This distribution is then used to evaluate both the equivocation, eq.(13), and the entropy
of the responses, eq.(14). We do not report the calculation, which is straightforward and
analogous to the one reported in [2]. The final result, which is valid for a finite population
size M, and up to the linear approximation in $M(\eta^0)^2/4\sigma^2$, is analogous to eq.(39):

$$I_{lin}(\{\xi_i\}, \vartheta \otimes s) = \frac{1}{\ln 2}\,\frac{M}{2\sigma^2}\,(A_1 - A_2). \qquad (44)$$



Thus, we expect that taking the limits $\xi^0 \to \infty$ and $r \to 0$ simultaneously in eq.(35), we
should get to the same result: the output information should equal the input one when $\sigma^2$
grows large.

From eq.(35) it is easy to show that:

$$\lim_{\xi^0\to\infty}\lim_{r\to 0} I(\{\xi_i\}, \vartheta \otimes s) = \frac{1}{\ln 2}\,\frac{M}{2}\,\ln\left[1 + \frac{2C\sigma_J^2 (A_1 - A_2)}{\sigma_\delta^2 + 2C\sigma_J^2\sigma^2}\right]. \qquad (45)$$

When $\sigma^2 \gg \sigma_\delta^2, A_1, A_2$ one obtains exactly the linear limit, eq.(44). We have verified this
analytical limit by studying numerically the approach to the asymptotic value of the mutual
information. Fig.2 shows the dependence of output information on the output noise $\sigma_\delta^2$, for
4 different choices of the (reciprocal of the) threshold, $\xi^0$. A large value, $\xi^0 = 10$, implies
linear output units. As expected, the output information, which always grows for decreasing
values of the output noise, for $\xi^0 = 10$ approaches asymptotically the input information. For
increasing values of the output noise, the information vanishes with a typical sigmoid curve,
FIG. 2: Output information, from eq.(35), as a function of the output noise $\sigma_\delta^2$, for 4 different
values of the output (reciprocal) threshold $\xi^0$. Logarithmic scale. N = 100; K = 4; M = 10;
$(\eta^0)^2 = 0.1$; $\alpha = 0.2$; $\sigma^2 = 1$; m = 1; $C\sigma_J^2 = 1$. The distribution $\varrho(\varepsilon)$ in eq.(4) is just equal to 1/3
for each of the 3 allowed $\varepsilon$ values of 0, 1/2 and 1. The dotted line represents the asymptotic value
of the input information, eq.(39), for N = 10.



with its point of inflection when the output matches the input noise. 
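The approach of the linear, small-r limit to the large-noise expression can be checked with a few lines of arithmetic. The sketch below assumes that the limiting expression has the form $(M/2\ln 2)\,\ln[1 + 2C\sigma_J^2(A_1-A_2)/(\sigma_\delta^2 + 2C\sigma_J^2\sigma^2)]$ and that the large-noise result is $(M/2\sigma^2\ln 2)(A_1 - A_2)$; the numerical values given to $A_1$ and $A_2$ are hypothetical placeholders, since their definitions are not evaluated here.

```python
import math

def info_linear_limit(M, A1, A2, sigma2, sigma_delta2, C_sigmaJ2=1.0):
    """Assumed form of the output information for linear units and small r (bits)."""
    x = 2.0 * C_sigmaJ2 * (A1 - A2) / (sigma_delta2 + 2.0 * C_sigmaJ2 * sigma2)
    return M / (2.0 * math.log(2.0)) * math.log1p(x)

def info_large_noise(M, A1, A2, sigma2):
    """Leading term of the large-noise expansion for M linear units (bits)."""
    return M * (A1 - A2) / (2.0 * math.log(2.0) * sigma2)

M, A1, A2 = 10, 0.05, 0.02  # A1 and A2 are placeholder values
# Input noise much larger than output noise and than A1, A2: the two agree.
ratio_large_noise = (info_linear_limit(M, A1, A2, sigma2=1e4, sigma_delta2=1e-6)
                     / info_large_noise(M, A1, A2, sigma2=1e4))
# Equal input and output noise: the output information is clearly lower.
ratio_equal_noise = (info_linear_limit(M, A1, A2, sigma2=1.0, sigma_delta2=1.0)
                     / info_large_noise(M, A1, A2, sigma2=1.0))
```

Under these assumptions the first ratio is indistinguishable from 1, while the second is about 2/3, reflecting the way output noise adds to the input noise in the denominator.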

We have then examined how the information in output (compared to the input) depends
on the number K of discrete correlates and on the width of the tuning function (3),
parametrized by m, with respect to the continuous correlate. Fig.3 shows a comparison
between input and output information for a sample of 10 cells, as a function of K. Both
curves quickly reach an asymptotic value, obtained by setting $K \to \infty$ in eq.(32) for $A_2$.
The relative information loss in output is roughly constant with K. A comparison is shown
with the case where correlates are purely discrete, which is obtained by setting m = 0 in
eq.(3). The curves exhibit a similar behaviour, even if the rise with K is steeper, and the
asymptotic values are higher. This may be surprising, but it is in fact a consequence of the
specific model we have considered, eq.(2), where a unit has the same tuning curve to each
FIG. 3: Comparison between input and output information as a function of the number K of
discrete correlates, for the case of continuous+discrete correlates (m = 1) or with purely discrete
correlates (obtained by setting m = 0). In eq.(35) we have set N = 100; r = 0.1; $\xi^0 = -0.4$;
$(\eta^0)^2 = 0.1$; $\alpha = 0.2$; $\sigma^2 = 1$; $C\sigma_J^2 = 1$; $\sigma_\delta^2 = 1$. The distribution $\varrho(\varepsilon)$ in eq.(4) is just equal to 1/3
for each of the 3 allowed $\varepsilon$ values of 0, 1/2 and 1.

of the discrete correlates, only varying its amplitude with respect to a value constant in the
angle. As $K \to \infty$, most of the mutual information is about the discrete correlates, and
the tuning to the continuous dimension, present for m = 1, effectively adds noise to the
discrimination among discrete cases, noise which is not present for m = 0.

With respect to the continuous dimension, the selectivity of the input units can be
increased by varying the power m of the cosine from 0 (no selectivity) through 1 (very
distributed encoding, as for the discrete correlates) to higher values (progressively narrower
tuning functions). Fig.4 reports the resulting behaviour of the information in input and in
output, for the case K = 1 (only a continuous correlate) and K = 4 (continuous+discrete
correlates). Increasing selectivity implies a "sparser" || representation of the angle, the
continuous variable, and hence less information, on average. However if the correlate is purely
continuous there is an initial increase, before reaching the optimal sparseness. It should
FIG. 4: Comparison between input and output information as a function of the selectivity along the
continuous dimension, which is made sharper by increasing m. K = 1 implies a purely continuous
correlate, while the continuous+discrete case is obtained by setting K = 4. In eq.(35) we have set
N = 100; r = 0.1; $\xi^0 = -0.4$; $(\eta^0)^2 = 0.1$; $\alpha = 0.2$; m = 1; $C\sigma_J^2 = 1$; $\sigma_\delta^2 = \sigma^2 = 1$, in both cases.
The distribution $\varrho(\varepsilon)$ in eq.(4) is just equal to 1/3 for each of the 3 allowed $\varepsilon$ values of 0, 1/2 and
1.

be kept in mind, again, that the asymptotic equality of the K = 1 and K = 4 cases is a
consequence of the specific model, eq.(2), which assigns the same preferred angle to each
discrete correlate. The resolution with which the continuous dimension can be discriminated 
does not, within this model, improve with larger K, while the added contribution, of being 
able to discriminate among discrete correlates, decreases in relative importance as the tuning 
becomes sharper. 
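How quickly the tuning narrows with m can be quantified by the half-width at half-maximum of the tuning curve. The sketch below assumes the cosine tuning has the form $\cos^{2m}((\vartheta-\vartheta^0)/2)$, for which the curve falls to half its peak at $|\vartheta - \vartheta^0| = 2\arccos(2^{-1/2m})$; the function name `half_width_at_half_max` is a hypothetical helper introduced only for this illustration.

```python
import math

def half_width_at_half_max(m):
    """Angular distance from the preferred direction at which
    cos^{2m}(angle / 2) drops to half of its peak value (requires m > 0)."""
    return 2.0 * math.acos(0.5 ** (1.0 / (2.0 * m)))

# Widths, in radians, for progressively sharper tuning.
widths = {m: half_width_at_half_max(m) for m in (1, 2, 4, 8, 16)}
```

For m = 1 the half-width is $\pi/2$, so each unit responds appreciably over half of the circle, consistent with the description of m = 1 as a very distributed encoding; the width then shrinks monotonically as m grows.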

Figures 3 and 4 show that, as long as the output noise is non zero and the threshold
is finite, information is lost going from input to output, but the information loss does not
appear to depend on the structure and on the dimensionality of the correlate.

Note that, while the purely continuous case has been easily obtained by setting K = 1 in
the expression of $A_2$, eq.(32), for the purely discrete case it is enough to set m = 0.
VI. DISCUSSION

We have attempted to clarify how information about multi-dimensional stimuli, with 
both a continuous and a discrete dimension, is transmitted from a population of units with 
a known coding scheme, down to the next stage of processing. 

Previous studies had focused on the mutual information between input and output units 
in a two-layer threshold-linear network either with learning || or with simple random con- 
nection weights ||. 

More recent investigations have tried to quantify the efficiency of a population of units in
coding a set of discrete [9] or continuous [10] correlates. The analysis in [9] has then been
generalized to the more realistic case of multi-dimensional continuous+discrete correlates
[2].

This work correlates with both research streams, in an effort to define a unique conceptual 
framework for population coding. The main difference with the second group of studies is 
obviously the presence of the network linking input to output units. The main difference 
with the first two papers, instead, is the analysis of a distinct mutual information quantity: 
not between input and output units, but between correlates ("stimuli") and output units. 
In [9] it had been argued, for a number K of purely discrete correlates, that the information
about the stimuli reduces to the information about the "reference" neural activity when
$K \to \infty$. The reference activity is simply the mean response to a given stimulus when the
information is measured from the variable, noisy responses around that mean; or it can be
taken to be the stored pattern of activity, when the retrieval of such patterns is considered, 
as in ||. True, the information about the stimuli saturates at the entropy of the stimulus 
set, but for $K \to \infty$ this entropy diverges, only the linear term in N is relevant ||, and the
two quantities, information about the stimuli and information about the reference activity, 
coincide. 

Our present saddle point calculation is only able to capture, effectively, the mutual infor- 
mation which is linear in the number of input units, as mentioned above. It fails to describe 
the approach to the saturating value, the entropy of the set of correlates, be this finite or 
infinite. Therefore, ours is close to a calculation of the information about a reference activity 
- in our case, the activity of the input units. The remaining difference is that we can take 
into account, albeit solely in the linear term, the dependence on K (through the equation
for $A_2$, eq.(32)), without having to take the further limit $K \to \infty$.

Due to the presence of a threshold and of a non zero output noise the information in
output is lower than that in input, and we have shown analytically that in the limit of a
noiseless, linear input-output transfer function the output information tends asymptotically
to the input one. We have not, however, introduced a threshold in the input units, which
would be necessary for a fair comparison. In an independent line of research, recent work
[11] has also quantified the contribution to the mutual information, in a different model, of
cubic and higher order non-linearities in the transfer function, by means of a diagrammatic
expansion in a noise parameter. In [2] it has been shown that the effect of a threshold in
the input units on the input information results merely in a renormalization of the noise.
The resulting effect on the output information remains to be explored, possibly with similar
methods.

Considering mixed continuous and discrete dimensions in our stimulus set, we had been
wondering whether the information loss in output depended on the presence or absence of
discrete or continuous dimensions in the stimulus structure. We have shown that for a fixed,
finite level of noise this loss does not depend significantly on the structure of the stimulus,
but solely on the relative magnitude of input and output noise, and on the position of the
output threshold.

Further developments of this analysis include the evaluation of the output information in 
presence of learning, in line with || , and with correlations in the firing of input units. 

A recent work has shown that the interplay between short and long range connectivities
in the Hopfield model leads to a deformation of the phase diagram with the appearance of
novel phases [12]. It would be interesting to introduce short and long range connections
in our model, and to examine how the coding efficiency of output neurons depends on the
interaction between short and long range connections. This will be the object of future
investigations.